Performability Optimization using Linear Bounds of Partially Observable Markov Decision Processes

Authors

  • Kaustubh R. Joshi
  • Matti A. Hiltunen
  • William H. Sanders
Abstract

Markov Decision Processes (MDPs) and Partially Observable MDPs (POMDPs) have been proposed as a framework for performability management. However, exact solution of even small POMDPs is very difficult because of their potentially infinite induced state spaces. In this paper, we present new lower bounds on the accumulated reward measure for MDPs and POMDPs. We describe how the bounds can be used in conjunction with heuristic search techniques in order to circumvent the state-space explosion problem in POMDPs. Our techniques can be used to choose actions that attempt to maximize performability during system recovery in self-healing systems.
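To make the idea of a linear lower bound concrete, the sketch below computes a standard "blind policy" bound for a toy recovery model. It is not the bound introduced in this paper: the two-state model, the action names, the reward numbers, and the discount factor GAMMA are all assumptions chosen for illustration. Each blind policy repeats one action forever regardless of observations, so its value is achievable and therefore bounds the optimal POMDP value from below; the bound is linear in the belief for each action.

```python
# Illustrative sketch only: a "blind policy" lower bound on a toy discounted POMDP.
# All numbers below are hypothetical and are not taken from the paper.

GAMMA = 0.9  # assumed discount factor

# States: 0 = faulty, 1 = healthy. Actions: 0 = restart, 1 = wait.
# P[a][s][t] = probability of moving from state s to state t under action a.
P = [
    [[0.2, 0.8], [0.0, 1.0]],    # restart: usually repairs a fault
    [[1.0, 0.0], [0.01, 0.99]],  # wait: a fault persists, a healthy system rarely fails
]

# R[a][s] = immediate reward (negative cost) of taking action a in state s.
R = [
    [-2.0, -1.0],  # restart always costs some downtime
    [-5.0, 0.0],   # waiting while faulty is expensive
]


def blind_policy_values(P_a, r_a, gamma=GAMMA, tol=1e-10):
    """Value of repeating one action forever: fixed point of alpha = r_a + gamma * P_a * alpha."""
    n = len(r_a)
    alpha = [0.0] * n
    while True:
        new = [
            r_a[s] + gamma * sum(P_a[s][t] * alpha[t] for t in range(n))
            for s in range(n)
        ]
        if max(abs(x - y) for x, y in zip(new, alpha)) < tol:
            return new
        alpha = new


def lower_bound(belief, alphas):
    """Best blind policy at this belief: max over actions of the dot product belief . alpha_a."""
    return max(sum(b * v for b, v in zip(belief, alpha)) for alpha in alphas)


# One value vector per action; each defines a function that is linear in the belief.
ALPHAS = [blind_policy_values(P[a], R[a]) for a in range(len(P))]

if __name__ == "__main__":
    for p_faulty in (0.0, 0.5, 1.0):
        belief = [p_faulty, 1.0 - p_faulty]
        print(p_faulty, lower_bound(belief, ALPHAS))
```

A heuristic search over beliefs could use such a bound to evaluate leaf nodes, since any value it reports is guaranteed to be attainable by some policy in the toy model.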

Similar resources

Incremental Methods for Computing Bounds in Partially Observable Markov Decision Processes

Partially observable Markov decision processes (POMDPs) allow one to model complex dynamic decision or control problems that include both action outcome uncertainty and imperfect observability. The control problem is formulated as a dynamic optimization problem with a value function combining costs or rewards from multiple steps. In this paper we propose, analyse and test various incremental me...

Approximate Linear Programming for Constrained Partially Observable Markov Decision Processes

In many situations, it is desirable to optimize a sequence of decisions by maximizing a primary objective while respecting some constraints with respect to secondary objectives. Such problems can be naturally modeled as constrained partially observable Markov decision processes (CPOMDPs) when the environment is partially observable. In this work, we describe a technique based on approximate lin...

Solving Partially Observable Markov Decision Processes by Neural Networks

Partially Observable Markov Decision Processes (POMDPs) cope with sequential decision processes where an agent tries to maximize or minimize some reward without complete knowledge of the process. These models are of interest for quality control, machine maintenance, reinforcement learning, etc. More generally, Monahan 99 has shown that many tasks in partially observable environments can be viewed ...

Producing efficient error-bounded solutions for transition independent decentralized MDPs

There has been substantial progress on algorithms for single-agent sequential decision making using partially observable Markov decision processes (POMDPs). A number of efficient algorithms for solving POMDPs share two desirable properties: error-bounds and fast convergence rates. Despite significant efforts, no algorithms for solving decentralized POMDPs benefit from these properties, leading ...

Geometry and Determinism of Optimal Stationary Control in Partially Observable Markov Decision Processes

It is well known that any finite-state Markov decision process (MDP) has a deterministic memoryless policy that maximizes the discounted long-term expected reward. Hence for such MDPs the optimal control problem can be solved over the set of memoryless deterministic policies. In the case of partially observable Markov decision processes (POMDPs), where there is uncertainty about the world state,...

Publication date: 2005